The Quest for a Cure
Let’s learn why no universal remedy against SQL Injection has been discovered yet.
Now that we know the threat of SQL Injection, the next natural question is, what do we need to do to protect code from being exploited? We may sometimes read a blog or an article that describes some single technique and claims it to be the universal remedy against SQL Injection. In reality, none of these techniques provides security against every form of SQL Injection, so we need to use all of them in different cases.
Escaping values#
The oldest way to protect SQL queries from accidental unmatched quote characters is to escape any quote characters to prevent them from becoming the end of the quoted string. In standard SQL, we can use two quote characters to make one literal quote character:
Most brands of the database also support the backslash to escape the following quote character, just like most other programming languages do:
The idea is that we transform application data before we interpolate it into SQL strings. Most SQL programming interfaces provide a convenience function. For example, in PHP’s PDO extension, we use the quote()
function to both delimit a string with quote characters and escape any literal quote characters within the string.
<?php
$project_name = $pdo->quote($_REQUEST["name"]);
$sql = "SELECT * FROM Projects WHERE project_name = $project_name";
?>
This technique can reduce the risk of SQL Injection resulting from unmatched quote characters within the dynamic content. But it doesn’t work as well for non-string content.
<?php
$password = $pdo->quote($_REQUEST["password"]);
$userid = $pdo->quote($_REQUEST["userid"]);
$sql = "UPDATE Accounts SET password_hash = SHA2($password) WHERE account_id = $userid";
?>
The alternate code in SQL is provided in the following playground.
We can’t compare a numeric column directly with a string containing digits in all brands of the database. Some databases may implicitly cast the string to a sensible numeric equivalent. Still, in standard SQL, we have to deliberately use the CAST()
function to convert a string to a numeric data type.
There are also obscure corner cases where strings in non-ASCII character sets can pass through a function intended to escape the quote characters but leave unescaped quote characters intact. This is an example of this.
<?php
$stmt = $pdo->prepare("SELECT * FROM Projects WHERE project_name = ?");
$params = array($_REQUEST["name"]);
$stmt->execute($params);
?>
Many programmers recommend this solution because we don’t have to escape dynamic content or worry about flawed escaping functions. In fact, query parameters are a very strong defense against SQL Injection. But parameters aren’t a universal solution because the value of a query parameter is always interpreted as a single literal value.
No lists of values can be a single parameter#
Sometimes the user can write a list of values separated by a comma or any other separator character. Still, this approach doesn’t work as expected because we cannot use a single parameter to list multiple values.
<?php
$stmt = $pdo->prepare("SELECT * FROM Bugs WHERE bug_id IN ( ? )");
$stmt->execute(array("1234,3456,5678"));
?>
This works as though we provided a single string value composed of digits and commas, which don’t work the same as a series of integers:
The widget only shows the record of 1234.
No table identifier can be a parameter#
Sometimes the user can write the table name as a string which is the wrong approach. Writing a string instead of writing the table name must result in a syntax error.
<?php
$stmt = $pdo->prepare("SELECT * FROM ? WHERE bug_id = 1234");
$stmt->execute(array("Bugs"));
?>
This works as though we had entered a string literal in place of the table name, which is simply a syntax error:
No column identifier can be a parameter#
Sometimes the user can write the column name of a table as a string which is the wrong approach: it will not work as expected if we try to write a string instead of writing the column name.
<?php
$stmt = $pdo->prepare("SELECT * FROM Bugs ORDER BY ?");
$stmt->execute(array("date_reported"));
?>
In this example, the sort is a no-op because the expression is a constant string, the same on every row:
No SQL keyword can be a parameter#
Sometimes the user can write the keyword as a string which is the wrong approach. Writing a string instead of writing the keyword must result in a syntax error.
<?php
$stmt = $pdo->prepare("SELECT * FROM Bugs ORDER BY date_reported ?");
$stmt->execute(array("DESC"));
?>
The parameter is interpreted as a literal string, not an SQL keyword. In this example, the result is a syntax error.
Stored procedures#
The use of stored procedures is another method that many software developers claim is proof against SQL Injection vulnerabilities. Typically, stored procedures contain fixed SQL statements, parsed when we define the procedure.
However, it’s possible to use dynamic SQL in stored procedures unsafely. In the following example, the input_userid
argument is interpolated into the SQL query verbatim, which is unsafe.
Using dynamic SQL in a stored procedure is no more and no less safe than using dynamic SQL in application code. The input_userid
argument can contain harmful content and produce an unsafe SQL statement:
What was my complete query?
Data access frameworks#
We might see advocates of data access frameworks claim that their library protects our code from SQL Injection risks. This is a false claim for any framework that allows us to write SQL statements as strings.
Practice good hygiene
After I gave a presentation on a PHP data access framework that I had developed, a member of the audience approached me and asked, “Does your framework prevent SQL Injection?” I answered that it provided functions for quoting strings and using query parameters.
The questioner looked puzzled. “But can it prevent SQL Injection?” they repeated. They were looking for an automatic way to ensure that they don’t make a mistake that they don’t know how to recognize themselves.
I told them that the framework prevents SQL Injection like a toothbrush prevents cavities. We have to use it consistently to get the benefit.
No framework can force us to write safe SQL code. A framework may provide convenience functions to help us, but it’s easy to bypass these functions and use common string manipulation to build an SQL statement unsafely.